<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico">
<script>MathJax = {tex: {inlineMath: [["\\(", "\\)"]]}, svg: {fontCache: "global"}};</script>
<script type="text/javascript" id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-svg.js"></script>
<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.4.0/styles/default.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.4.0/highlight.min.js"></script>
<script>hljs.highlightAll();</script>
<script src="../static/mccole.js"></script>
<script>window.onload = () => fixPage()</script>

    <script defer data-domain="stjs.tech" src="https://plausible.io/js/plausible.js"></script>

    <link href="../static/mccole.css" rel="stylesheet" type="text/css">
    <title>Software Design by Example: <h1 id="virtual-machine">Virtual Machine</h1></title>
  </head>
  <body>
    <main>
      <p class="home"><a href="../">Software Design by Example</a></p>
      <h1 id="virtual-machine">Virtual Machine</h1>
      <div class="lede"><p>Assembling and running low-level code</p></div>
      <nav>
	<ol class="toc">
<li><a href="#virtual-machine-arch">What is the architecture of our virtual machine?</a></li>
<li><a href="#virtual-machine-execute">How can we execute these instructions?</a></li>
<li><a href="#virtual-machine-assembly">What do assembly programs look like?</a></li>
<li><a href="#virtual-machine-data">How can we store data?</a></li>
<li><a href="#virtual-machine-exercises">Exercises</a></li>
</ol>

      </nav>
      <div class="keyterms"><p><a href="../glossary/#abi">Application Binary Interface</a>, <a href="../glossary/#assembler">assembler</a>, <a href="../glossary/#assembly_code">assembly code</a>, <a href="../glossary/#bitwise_operation">bitwise operation</a>, <a href="../glossary/#disassembler">disassembler</a>, <a href="../glossary/#instruction_pointer">instruction pointer</a>, <a href="../glossary/#instruction_set">instruction set</a>, <a href="../glossary/#label_address">label (address in memory)</a>, <a href="../glossary/#op_code">op code</a>, <a href="../glossary/#register">register</a>, <a href="../glossary/#virtual_machine">virtual machine</a>, <a href="../glossary/#word_memory">word (of memory)</a></p></div>

      
<p>Computers don't execute JavaScript directly.
Instead,
each processor has its own <span g="instruction_set" i="instruction set">instruction set</span>,
and a compiler translates high-level languages into those instructions.
Compilers often use an intermediate representation called <span g="assembly_code" i="assembly code">assembly code</span>
that gives instructions human-readable names instead of numbers.
To understand more about how JavaScript actually runs
we will simulate a very simple processor with a little bit of memory.
If you want to dive deeper,
have a look at <span i="Nystrom, Bob"><a href="http://journal.stuffwithstuff.com/">Bob Nystrom's</a></span> <em><a href="https://craftinginterpreters.com/">Crafting Interpreters</a></em>.
You may also enjoy <span i="Human Resource Machine"><a href="https://tomorrowcorporation.com/humanresourcemachine">Human Resource Machine</a></span>,
which asks you to solve puzzles of increasing difficulty
using a processor almost as simple as ours.</p>
<h2 id="virtual-machine-arch">What is the architecture of our virtual machine?</h2>
<p>Our <span g="virtual_machine" i="virtual machine">virtual machine</span> has three parts,
which are shown in <a class="figref" href="#virtual-machine-architecture">Figure&nbsp;19.1</a>
for a program made up of 110 instructions:</p>
<ol>
<li>
<p>An <span g="instruction_pointer" i="instruction pointer">instruction pointer</span> (IP)
that holds the memory address of the next instruction to execute.
It is automatically initialized to point at address 0,
which is where every program must start.
This rule is part of the <span g="abi" i="Application Binary Interface">Application Binary Interface</span> (ABI)
for our virtual machine.</p>
</li>
<li>
<p>Four <span g="register" i="register (in computer)">registers</span> named R0 to R4 that instructions can access directly.
There are no memory-to-memory operations in our VM:
everything  happens in or through registers.</p>
</li>
<li>
<p>256 <span g="word_memory">words</span> of memory, each of which can store a single value.
Both the program and its data live in this single block of memory;
we chose the size 256 so that each address will fit in a single byte.</p>
</li>
</ol>
<figure id="virtual-machine-architecture">
  <img src="figures/architecture.svg" alt="Virtual machine architecture" />
  <figcaption>Figure&nbsp;19.1: Architecture of the virtual machine.</figcaption>
</figure>
<p>The instructions for our VM are 3 bytes long.
The <span g="op_code" i="op code; virtual machine!op code">op code</span> fits into one byte,
and each instruction may optionally include one or two single-byte operands.
Each operand is a register identifier,
a constant,
or an address
(which is just a constant that identifies a location in memory);
since constants have to fit in one byte,
the largest number we can represent directly is 256.
<a class="tblref" href="#virtual-machine-op-codes">Table&nbsp;19.1</a> uses the letters <code>r</code>, <code>c</code>, and <code>a</code>
to indicate instruction format,
where <code>r</code> indicates a register identifier,
<code>c</code> indicates a constant,
and <code>a</code> indicates an address.</p>
<table id="virtual-machine-op-codes"><caption>Table&nbsp;19.1: Virtual machine op codes.</caption>
<thead>
<tr>
<th>Instruction</th>
<th>Code</th>
<th>Format</th>
<th>Action</th>
<th>Example</th>
<th>Equivalent</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>hlt</code></td>
<td>1</td>
<td><code>--</code></td>
<td>Halt program</td>
<td><code>hlt</code></td>
<td><code>process.exit(0)</code></td>
</tr>
<tr>
<td><code>ldc</code></td>
<td>2</td>
<td><code>rc</code></td>
<td>Load immediate</td>
<td><code>ldc R0 123</code></td>
<td><code>R0 := 123</code></td>
</tr>
<tr>
<td><code>ldr</code></td>
<td>3</td>
<td><code>rr</code></td>
<td>Load register</td>
<td><code>ldr R0 R1</code></td>
<td><code>R0 := RAM[R1]</code></td>
</tr>
<tr>
<td><code>cpy</code></td>
<td>4</td>
<td><code>rr</code></td>
<td>Copy register</td>
<td><code>cpy R0 R1</code></td>
<td><code>R0 := R1</code></td>
</tr>
<tr>
<td><code>str</code></td>
<td>5</td>
<td><code>rr</code></td>
<td>Store register</td>
<td><code>str R0 R1</code></td>
<td><code>RAM[R1] := R0</code></td>
</tr>
<tr>
<td><code>add</code></td>
<td>6</td>
<td><code>rr</code></td>
<td>Add</td>
<td><code>add R0 R1</code></td>
<td><code>R0 := R0 + R1</code></td>
</tr>
<tr>
<td><code>sub</code></td>
<td>7</td>
<td><code>rr</code></td>
<td>Subtract</td>
<td><code>sub R0 R1</code></td>
<td><code>R0 := R0 - R1</code></td>
</tr>
<tr>
<td><code>beq</code></td>
<td>8</td>
<td><code>ra</code></td>
<td>Branch if equal</td>
<td><code>beq R0 123</code></td>
<td><code>if (R0 === 0) PC := 123</code></td>
</tr>
<tr>
<td><code>bne</code></td>
<td>9</td>
<td><code>ra</code></td>
<td>Branch if not equal</td>
<td><code>bne R0 123</code></td>
<td><code>if (R0 !== 0) PC := 123</code></td>
</tr>
<tr>
<td><code>prr</code></td>
<td>10</td>
<td><code>r-</code></td>
<td>Print register</td>
<td><code>prr R0</code></td>
<td><code>console.log(R0)</code></td>
</tr>
<tr>
<td><code>prm</code></td>
<td>11</td>
<td><code>r-</code></td>
<td>Print memory</td>
<td><code>prm R0</code></td>
<td><code>console.log(RAM[R0])</code></td>
</tr>
</tbody>
</table>
<p>We put our VM's architectural details in a file
that can be shared by other components:</p>
<pre title="architecture.js"><code class="language-js">const OPS = {
  hlt: { code:  1, fmt: '--' }, // Halt program
  ldc: { code:  2, fmt: 'rv' }, // Load immediate
  ldr: { code:  3, fmt: 'rr' }, // Load register
  cpy: { code:  4, fmt: 'rr' }, // Copy register
  str: { code:  5, fmt: 'rr' }, // Store register
  add: { code:  6, fmt: 'rr' }, // Add
  sub: { code:  7, fmt: 'rr' }, // Subtract
  beq: { code:  8, fmt: 'rv' }, // Branch if equal
  bne: { code:  9, fmt: 'rv' }, // Branch if not equal
  prr: { code: 10, fmt: 'r-' }, // Print register
  prm: { code: 11, fmt: 'r-' }  // Print memory
}

const OP_MASK = 0xFF // select a single byte
const OP_SHIFT = 8   // shift up by one byte
const OP_WIDTH = 6   // op width in characters when printing

const NUM_REG = 4    // number of registers
const RAM_LEN = 256  // number of words in RAM

export {
  OPS,
  OP_MASK,
  OP_SHIFT,
  OP_WIDTH,
  NUM_REG,
  RAM_LEN
}
</code></pre>
<!-- continue -->
<p>While there isn't a name for this design pattern,
putting all the constants that define a system in one file
instead of scattering them across multiple files
makes them easier to find as well as ensuring consistency.</p>
<h2 id="virtual-machine-execute">How can we execute these instructions?</h2>
<p>As in previous chapters,
we will split a class that would normally be written in one piece into several parts for exposition.
We start by defining a class with an instruction pointer, some registers, and some memory
along with a prompt for output:</p>
<pre title="vm-base.js"><code class="language-js">import assert from 'assert'

import {
  OP_MASK,
  OP_SHIFT,
  NUM_REG,
  RAM_LEN
} from './architecture.js'

const COLUMNS = 4
const DIGITS = 8

class VirtualMachineBase {
  constructor () {
    this.ip = 0
    this.reg = Array(NUM_REG)
    this.ram = Array(RAM_LEN)
    this.prompt = '&gt;&gt;'
  }

}

export default VirtualMachineBase
</code></pre>
<p>A program is just an array of numbers representing instructions.
To load one,
we copy those numbers into memory and reset the instruction pointer and registers:</p>
<pre title="vm-base.js"><code class="language-js">  initialize (program) {
    assert(program.length &lt;= this.ram.length,
      'Program is too long for memory')
    for (let i = 0; i &lt; this.ram.length; i += 1) {
      if (i &lt; program.length) {
        this.ram[i] = program[i]
      } else {
        this.ram[i] = 0
      }
    }
    this.ip = 0
    this.reg.fill(0)
  }
</code></pre>
<p>In order to handle the next instruction,
the VM gets the value in memory that the instruction pointer currently refers to
and moves the instruction pointer on by one address.
It then uses <span g="bitwise_operation" i="bitwise operation">bitwise operations</span>
to extract the op code and operands from the instruction
(<a class="figref" href="#virtual-machine-unpacking">Figure&nbsp;19.2</a>):</p>
<pre title="vm-base.js"><code class="language-js">  fetch () {
    assert((0 &lt;= this.ip) &amp;&amp; (this.ip &lt; RAM_LEN),
      `Program counter ${this.ip} out of range 0..${RAM_LEN}`)
    let instruction = this.ram[this.ip]
    this.ip += 1
    const op = instruction &amp; OP_MASK
    instruction &gt;&gt;= OP_SHIFT
    const arg0 = instruction &amp; OP_MASK
    instruction &gt;&gt;= OP_SHIFT
    const arg1 = instruction &amp; OP_MASK
    return [op, arg0, arg1]
  }
</code></pre>
<figure id="virtual-machine-unpacking">
  <img src="figures/unpacking.svg" alt="Unpacking instructions" />
  <figcaption>Figure&nbsp;19.2: Using bitwise operations to unpack instructions.</figcaption>
</figure>
<blockquote>
<h3>Semi-realistic</h3>
<p>We always unpack two operands regardless of whether the instructions has them or not,
since this is what a hardware implementation would be.
We have also included assertions in our VM
to simulate the way that real hardware includes logic
to detect illegal instructions and out-of-bound memory addresses.</p>
</blockquote>
<p>The next step is to extend our base class with one that has a <code>run</code> method.
As its name suggests,
this runs the program by fetching instructions and executing them until told to stop:</p>
<pre title="vm.js"><code class="language-js">import assert from 'assert'

import {
  OPS
} from './architecture.js'

import VirtualMachineBase from './vm-base.js'

class VirtualMachine extends VirtualMachineBase {
  run () {
    let running = true
    while (running) {
      const [op, arg0, arg1] = this.fetch()
      switch (op) {
        case OPS.hlt.code:
          running = false
          break

        case OPS.ldc.code:
          this.assertIsRegister(arg0, op)
          this.reg[arg0] = arg1
          break


        default:
          assert(false, `Unknown op ${op}`)
          break
      }
    }
  }

  assertIsRegister (reg) {
    assert((0 &lt;= reg) &amp;&amp; (reg &lt; this.reg.length),
      `Invalid register ${reg}`)
  }

  assertIsAddress (addr) {
    assert((0 &lt;= addr) &amp;&amp; (addr &lt; this.ram.length),
      `Invalid register ${addr}`)
  }
}

export default VirtualMachine
</code></pre>
<p>Some instructions are very similar to others,
so we will only look at three here.
The first stores the value of one register in the address held by another register:</p>
<pre title="vm.js"><code class="language-js">        case OPS.str.code:
          this.assertIsRegister(arg0, op)
          this.assertIsRegister(arg1, op)
          this.assertIsAddress(this.reg[arg1], op)
          this.ram[this.reg[arg1]] = this.reg[arg0]
          break
</code></pre>
<!-- continue -->
<p>The first three lines check that the operation is legal;
the fourth one uses the value in one register as an address,
which is why it has nested array indexing.</p>
<p>Adding the value in one register to the value in another register is simpler:</p>
<pre title="vm.js"><code class="language-js">        case OPS.add.code:
          this.assertIsRegister(arg0, op)
          this.assertIsRegister(arg1, op)
          this.reg[arg0] += this.reg[arg1]
          break
</code></pre>
<!-- continue -->
<p>as is jumping to a fixed address if the value in a register is zero:</p>
<pre title="vm.js"><code class="language-js">        case OPS.beq.code:
          this.assertIsRegister(arg0, op)
          this.assertIsAddress(arg1, op)
          if (this.reg[arg0] === 0) {
            this.ip = arg1
          }
          break
</code></pre>
<h2 id="virtual-machine-assembly">What do assembly programs look like?</h2>
<p>We could figure out numerical op codes by hand,
and in fact that's what <a href="http://eniacprogrammers.org/">the first programmers</a> did.
However,
it is much easier to use an <span g="assembler" i="assembler">assembler</span>,
which is just a small compiler for a language that very closely represents actual machine instructions.</p>
<p>Each command in our assembly languages matches an instruction in the VM.
Here's an assembly language program to print the value stored in R1 and then halt:</p>
<pre title="print-r1.as"><code class="language-as"># Print initial contents of R1.
prr R1
hlt
</code></pre>
<!-- continue -->
<p>Its numeric representation is:</p>
<pre title="print-r1.mx"><code class="language-mx">00010a
000001
</code></pre>
<p>One thing the assembly language has that the instruction set doesn't
is <span g="label_address" i="label (on address)">labels on addresses</span>.
The label <code>loop</code> doesn't take up any space;
instead,
it tells the assembler to give the address of the next instruction a name
so that we can refer to that address as <code>@loop</code> in jump instructions.
For example,
this program prints the numbers from 0 to 2
(<a class="figref" href="#virtual-machine-count-up">Figure&nbsp;19.3</a>):</p>
<pre title="count-up.as"><code class="language-as"># Count up to 3.
# - R0: loop index.
# - R1: loop limit.
ldc R0 0
ldc R1 3
loop:
prr R0
ldc R2 1
add R0 R2
cpy R2 R1
sub R2 R0
bne R2 @loop
hlt
</code></pre>


<pre title="count-up.mx"><code class="language-mx">000002
030102
00000a
010202
020006
010204
000207
020209
000001
</code></pre>
<figure id="virtual-machine-count-up">
  <img src="figures/count-up.svg" alt="Counting from 0 to 2" />
  <figcaption>Figure&nbsp;19.3: Flowchart of assembly language program to count up from 0 to 2.</figcaption>
</figure>
<p>Let's trace this program's execution
(<a class="figref" href="#virtual-machine-trace-counter">Figure&nbsp;19.4</a>):</p>
<ol>
<li>R0 holds the current loop index.</li>
<li>R1 holds the loop's upper bound (in this case 3).</li>
<li>The loop prints the value of R0 (one instruction).</li>
<li>The program adds 1 to R0.
This takes two instructions because we can only add register-to-register.</li>
<li>It checks to see if we should loop again,
which takes three instructions.</li>
<li>If the program <em>doesn't</em> jump back, it halts.</li>
</ol>
<figure id="virtual-machine-trace-counter">
  <img src="figures/trace-counter.svg" alt="Trace counting program" />
  <figcaption>Figure&nbsp;19.4: Tracing registers and memory values for a simple counting program.</figcaption>
</figure>
<p>The implementation of the assembler mirrors the simplicity of assembly language.
The main method gets interesting lines,
finds the addresses of labels,
and turns each remaining line into an instruction:</p>
<pre title="assembler.js"><code class="language-js">  assemble (lines) {
    lines = this.cleanLines(lines)
    const labels = this.findLabels(lines)
    const instructions = lines.filter(line =&gt; !this.isLabel(line))
    const compiled = instructions.map(instr =&gt; this.compile(instr, labels))
    const program = this.instructionsToText(compiled)
    return program
  }

  cleanLines (lines) {
    return lines
      .map(line =&gt; line.trim())
      .filter(line =&gt; line.length &gt; 0)
      .filter(line =&gt; !this.isComment(line))
  }

  isComment (line) {
    return line.startsWith('#')
  }
</code></pre>
<p>To find labels,
we go through the lines one by one
and either save the label <em>or</em> increment the current address
(because labels don't take up space):</p>
<pre title="assembler.js"><code class="language-js">  findLabels (lines) {
    const result = {}
    let index = 0
    lines.forEach(line =&gt; {
      if (this.isLabel(line)) {
        const label = line.slice(0, -1)
        assert(!(label in result),
          `Duplicate label ${label}`)
        result[label] = index
      } else {
        index += 1
      }
    })
    return result
  }

  isLabel (line) {
    return line.endsWith(':')
  }
</code></pre>
<p>To compile a single instruction we break the line into tokens,
look up the format for the operands,
and pack them into a single value:</p>
<pre title="assembler.js"><code class="language-js">  compile (instruction, labels) {
    const [op, ...args] = instruction.split(/\s+/)
    assert(op in OPS,
      `Unknown operation &quot;${op}&quot;`)
    let result = 0
    switch (OPS[op].fmt) {
      case '--':
        result = this.combine(
          OPS[op].code
        )
        break
      case 'r-':
        result = this.combine(
          this.register(args[0]),
          OPS[op].code
        )
        break
      case 'rr':
        result = this.combine(
          this.register(args[1]),
          this.register(args[0]),
          OPS[op].code
        )
        break
      case 'rv':
        result = this.combine(
          this.value(args[1], labels),
          this.register(args[0]),
          OPS[op].code
        )
        break
      default:
        assert(false,
          `Unknown instruction format ${OPS[op].fmt}`)
    }
    return result
  }
</code></pre>
<p>Combining op codes and operands into a single value
is the reverse of the unpacking done by the virtual machine:</p>
<pre title="assembler.js"><code class="language-js">  combine (...args) {
    assert(args.length &gt; 0,
      'Cannot combine no arguments')
    let result = 0
    for (const a of args) {
      result &lt;&lt;= OP_SHIFT
      result |= a
    }
    return result
  }
</code></pre>
<p>Finally, we need few utility functions:</p>
<pre title="assembler.js"><code class="language-js">  instructionsToText (program) {
    return program.map(op =&gt; op.toString(16).padStart(OP_WIDTH, '0'))
  }

  register (token) {
    assert(token[0] === 'R',
      `Register &quot;${token}&quot; does not start with 'R'`)
    const r = parseInt(token.slice(1))
    assert((0 &lt;= r) &amp;&amp; (r &lt; NUM_REG),
      `Illegal register ${token}`)
    return r
  }

  value (token, labels) {
    if (token[0] !== '@') {
      return parseInt(token)
    }
    const labelName = token.slice(1)
    assert(labelName in labels,
      `Unknown label &quot;${token}&quot;`)
    return labels[labelName]
  }
</code></pre>
<p>Let's try assembling a program and display its output,
the registers,
and the interesting contents of memory.
As a test,
this program counts up to three:</p>
<pre title="count-up.as"><code class="language-as"># Count up to 3.
# - R0: loop index.
# - R1: loop limit.
ldc R0 0
ldc R1 3
loop:
prr R0
ldc R2 1
add R0 R2
cpy R2 R1
sub R2 R0
bne R2 @loop
hlt
</code></pre>
<h2 id="virtual-machine-data">How can we store data?</h2>
<p>It is tedious to write interesting programs when each value needs a unique name.
We can do a lot more once we have collections like <span i="array!implementation of">arrays</span>,
so let's add those to our assembler.
We don't have to make any changes to the virtual machine,
which doesn't care if we think of a bunch of numbers as individuals or elements of an array,
but we do need a way to create arrays and refer to them.</p>
<p>We will allocate storage for arrays at the end of the program
by using <code>.data</code> on a line of its own to mark the start of the data section
and then <code>label: number</code> to give a region a name and allocate some storage space
(<a class="figref" href="#virtual-machine-storage-allocation">Figure&nbsp;19.5</a>).</p>
<figure id="virtual-machine-storage-allocation">
  <img src="figures/storage-allocation.svg" alt="Storage allocation" />
  <figcaption>Figure&nbsp;19.5: Allocating storage for arrays in the virtual machine.</figcaption>
</figure>
<p>This enhancement only requires a few changes to the assembler.
First,
we need to split the lines into instructions and data allocations:</p>
<pre title="allocate-data.js"><code class="language-js">  assemble (lines) {
    lines = this.cleanLines(lines)
    const [toCompile, toAllocate] = this.splitAllocations(lines)
    const labels = this.findLabels(lines)
    const instructions = toCompile.filter(line =&gt; !this.isLabel(line))
    const baseOfData = instructions.length
    this.addAllocations(baseOfData, labels, toAllocate)
    const compiled = instructions.map(instr =&gt; this.compile(instr, labels))
    const program = this.instructionsToText(compiled)
    return program
  }
</code></pre>
<pre title="allocate-data.js"><code class="language-js">  splitAllocations (lines) {
    const split = lines.indexOf(DIVIDER)
    if (split === -1) {
      return [lines, []]
    } else {
      return [lines.slice(0, split), lines.slice(split + 1)]
    }
  }
</code></pre>
<p>Second,
we need to figure out where each allocation lies and create a label accordingly:</p>
<pre title="allocate-data.js"><code class="language-js">  addAllocations (baseOfData, labels, toAllocate) {
    toAllocate.forEach(alloc =&gt; {
      const fields = alloc.split(':').map(a =&gt; a.trim())
      assert(fields.length === 2,
        `Invalid allocation directive &quot;${alloc}&quot;`)
      const [label, numWordsText] = fields
      assert(!(label in labels),
        `Duplicate label &quot;${label}&quot; in data allocation`)
      const numWords = parseInt(numWordsText)
      assert((baseOfData + numWords) &lt; RAM_LEN,
        `Allocation &quot;${label}&quot; requires too much memory`)
      labels[label] = baseOfData
      baseOfData += numWords
    })
  }
</code></pre>
<p>And that's it:
no other changes are needed to either compilation or execution.
To test it,
let's fill an array with the numbers from 0 to 3:</p>
<pre title="fill-array.as"><code class="language-as"># Count up to 3.
# - R0: loop index.
# - R1: loop limit.
# - R2: array index.
# - R3: temporary.
ldc R0 0
ldc R1 3
ldc R2 @array
loop:
str R0 R2
ldc R3 1
add R0 R3
add R2 R3
cpy R3 R1
sub R3 R0
bne R3 @loop
hlt
.data
array: 10
</code></pre>
<blockquote>
<h3>How does it actually work?</h3>
<p>Our VM is just another program.
If you'd like to know what happens when instructions finally meet hardware,
and how electrical circuits are able to do arithmetic,
make decisions,
and talk to the world,
[<a href="../bibliography/#Patterson2017">Patterson2017</a>] has everything you want to know and more.</p>
</blockquote>
<h2 id="virtual-machine-exercises">Exercises</h2>
<h3 class="exercise">Swapping values</h3>
<p>Write an assembly language program that swaps the values in R1 and R2
without affecting the values in other registers.</p>
<h3 class="exercise">Reversing an array</h3>
<p>Write an assembly language program that starts with:</p>
<ul>
<li>the base address of an array in one word</li>
<li>the length of the array N in the next word</li>
<li>N values immediately thereafter</li>
</ul>
<!-- continue -->
<p>and reverses the array in place.</p>
<h3 class="exercise">Increment and decrement</h3>
<ol>
<li>
<p>Add instructions <code>inc</code> and <code>dec</code> that add one to the value of a register
and subtract one from the value of a register respectively.</p>
</li>
<li>
<p>Rewrite the examples to use these instructions.
How much shorter do they make the programs?
How much easier to read?</p>
</li>
</ol>
<h3 class="exercise">Using long addresses</h3>
<ol>
<li>
<p>Modify the virtual machine so that the <code>ldr</code> and <code>str</code> instructions
contain 16-bit addresses rather than 8-bit addresses
and increase the virtual machine's memory to 64K words to match.</p>
</li>
<li>
<p>How does this complicate instruction interpretation?</p>
</li>
</ol>
<h3 class="exercise">Operating on strings</h3>
<p>The C programming language stored character strings as non-zero bytes terminated by a byte containing zero.</p>
<ol>
<li>
<p>Write a program that starts with the base address of a string in R1
and finishes with the length of the string (not including the terminator) in the same register.</p>
</li>
<li>
<p>Write a program that starts with the base address of a string in R1
and the base address of some other block of memory in R2
and copies the string to that new location (including the terminator).</p>
</li>
<li>
<p>What happens in each case if the terminator is missing?</p>
</li>
</ol>
<h3 class="exercise">Call and return</h3>
<ol>
<li>
<p>Add another register to the virtual machine called SP (for &quot;stack pointer&quot;)
that is automatically initialized to the <em>last</em> address in memory.</p>
</li>
<li>
<p>Add an instruction <code>psh</code> (short for &quot;push&quot;) that copies a value from a register
to the address stored in SP and then subtracts one from SP.</p>
</li>
<li>
<p>Add an instruction <code>pop</code> (short for &quot;pop&quot;) that adds one to SP
and then copies a value from that address into a register.</p>
</li>
<li>
<p>Using these instructions,
write a subroutine that evaluates <code>2x+1</code> for every value in an array.</p>
</li>
</ol>
<h3 class="exercise">Disassembling instructions</h3>
<p>A <span g="disassembler">disassembler</span> turns machine instructions into assembly code.
Write a disassembler for the instruction set used by our virtual machine.
(Since the labels for addresses are not stored in machine instructions,
disassemblers typically generate labels like <code>@L001</code> and <code>@L002</code>.)</p>
<h3 class="exercise">Linking multiple files</h3>
<ol>
<li>
<p>Modify the assembler to handle <code>.include filename</code> directives.</p>
</li>
<li>
<p>What does your modified assembler do about duplicate label names?
How does it prevent infinite includes
(i.e., <code>A.as</code> includes <code>B.as</code> which includes <code>A.as</code> again)?</p>
</li>
</ol>
<h3 class="exercise">Providing system calls</h3>
<p>Modify the virtual machine so that developers can add &quot;system calls&quot; to it.</p>
<ol>
<li>
<p>On startup,
the virtual machine loads an array of functions defined in a file called <code>syscalls.js</code>.</p>
</li>
<li>
<p>The <code>sys</code> instruction takes a one-byte constant argument.
It looks up the corresponding function and calls it with the values of R0-R3 as parameters
and places the result in R0.</p>
</li>
</ol>
<h3 class="exercise">Unit testing</h3>
<ol>
<li>
<p>Write unit tests for the assembler.</p>
</li>
<li>
<p>Once they are working,
write unit tests for the virtual machine.</p>
</li>
</ol>

    </main>
    <footer>
  <hr/>
  <p>
    Copyright © 2022 Greg Wilson
    &nbsp;
    <a href="../license/"><img class="icon" src="../static/cc-by-nc.svg" alt="CC-BY-NC icon"/></a>
    <a href="../license/"><img class="icon" src="../static/hippocratic.svg" alt="Hippocratic License icon"/></a>
    <a href="https://github.com/software-tools-books/stjs/"><img class="icon" src="../static/github.svg" alt="GitHub icon"/></a>
    <a href="mailto:gvwilson@third-bit.com"><img class="icon" src="../static/email.svg" alt="email icon"/></a>
  </p>
</footer>

  </body>
</html>
